# Transformer encoder

| Model | Org | License | Description | Tags | Downloads | Likes |
|---|---|---|---|---|---|---|
| Dinov2 Giant | facebook | Apache-2.0 | A Vision Transformer trained with the DINOv2 method for self-supervised image feature extraction. | Image Classification, Transformers | 117.56k | 41 |
| Dinov2 Base | facebook | Apache-2.0 | A Vision Transformer trained with the DINOv2 method, extracting image features through self-supervised learning. | Image Classification, Transformers | 1.9M | 126 |
| Vit Large Patch32 384 | google | Apache-2.0 | A Vision Transformer (ViT) pre-trained on ImageNet-21k and fine-tuned on ImageNet, suitable for image classification. | Image Classification | 118.37k | 16 |
| Vit Huge Patch14 224 In21k | google | Apache-2.0 | A Vision Transformer pretrained on ImageNet-21k, with an extra-large architecture suited to visual tasks such as image classification. | Image Classification | 47.78k | 20 |
| Ruroberta Large | ai-forever | | A Russian RoBERTa-large model (355 million parameters) pre-trained by the SberDevices team on 250 GB of Russian text. | Large Language Model, Transformers, Other | 21.00k | 45 |
| Vit Large Patch32 224 In21k | google | Apache-2.0 | A Vision Transformer (ViT) pre-trained on ImageNet-21k, suitable for image classification. | Image Classification | 4,943 | 1 |
| Vit Large Patch16 384 | google | Apache-2.0 | A transformer-based image classification model pre-trained on ImageNet-21k and fine-tuned on ImageNet. | Image Classification | 161.29k | 12 |
| Vit Large Patch16 224 In21k | google | Apache-2.0 | A Vision Transformer pretrained on ImageNet-21k, suitable for image feature extraction and downstream fine-tuning. | Image Classification | 92.63k | 26 |